6. Reactor

(c) 2019, Dr. Ramil Nugmanov; Dr. Timur Madzhidov; Ravil Mukhametgaleev

Installation instructions of CGRtools package information and tutorial's files see on https://github.com/cimm-kzn/CGRtools

NOTE: Tutorial should be performed sequentially from the start. Random cell running will lead to unexpected results.


In [ ]:
import pkg_resources
if pkg_resources.get_distribution('CGRtools').version.split('.')[:2] != ['3', '1']:
    print('WARNING. Tutorial was tested on 3.1 version of CGRtools')
else:
    print('Welcome!')

In [ ]:
# load data for tutorial
from pickle import load
from traceback import format_exc

with open('reactions.dat', 'rb') as f:
    reactions = load(f) # list of ReactionContainer objects

r1 = reactions[0] # reaction
m6 = r1.reactants[1]

Reactor objects stores single transformation and can apply it to molecules or CGRs.

Transformations is ReactionContainer object which in reactant side consist of query for matching group and in product side patch for updating matched atoms and bonds


In [ ]:
from CGRtools import CGRreactor   # import of Reactor
from CGRtools.containers import * # import of required objects

6.1. Products generation

Reactor works similar to ChemAxon Reactions enumeration.

Example here presents application of it to create esters from acids.

First we need to construct carboxy group matcher query. Then, ether group need to be specified.

Atom numbers in query and patch should be mapped to each other. The same atoms should have same numbers.


In [ ]:
acid = QueryContainer()                       # this query matches acids. Use construction possibilities.
acid.add_atom({'element': 'C', 'neighbors': 3})   # add carboxyl carbon. Hybridization is irrelevant here
acid.add_atom({'element': 'O', 'neighbors': 1})   # add hydroxyl oxygen. Hybridization is irrelevant here 
acid.add_atom('O')                                # add carbonyl oxygen. Number of neighbors is irrelevant here.
acid.add_bond(1, 2, 1) # create single bond between carbon and hydroxyl oxygen
acid.add_bond(1, 3, 2) # create double bond
print(acid)

In [ ]:
methyl_ester = MoleculeContainer()  # create patch - how acrboxyl group should be changed. We write methylated group
methyl_ester.add_atom('C', 1) # second argument is predefined atom mapping. Notice that mapping corresponds...  
methyl_ester.add_atom('O', 2) # ... to order in already created acid group. Atom 2 is released water.
methyl_ester.add_atom('O', 4)
methyl_ester.add_atom('O', 3)
methyl_ester.add_atom('C', 5)
methyl_ester.add_bond(1, 4, 1)
methyl_ester.add_bond(1, 3, 2)
methyl_ester.add_bond(4, 5, 1)
# No bond between atom 1 and atom 2. This bond will be broken. 
methyl_ester.calculate2d()
methyl_ester

In [ ]:
m6.reset_query_marks() # required for correct matching
m6 # acid

In [ ]:
template = ReactionContainer([acid], [methyl_ester]) # merge query and patch in template, which is ReactionContainer
reactor = CGRreactor(template)                        # Reactor is initialized
reacted_acid = reactor(m6)                            # application of Reactor to molecule

In [ ]:
reacted_acid.calculate2d(scale=2) # calculate coordinates
reacted_acid       # desired methylated ester have been generated

One can notice presence of separate oxygen (water) and ester group.

The second group can substituted by calling reactor on observed product.


In [ ]:
reacted_acid.reset_query_marks() # this is new molecule and query marks need to be set
second_stage = reactor(reacted_acid) # apply transformation on product of previous transformation
second_stage.calculate2d(scale=2) #  recalculate coordinates for correct drawing
second_stage

second_stage has 3 components in a single MoleculeContainer object. We can split it into individual molecules and place all molecules into ReactionContainer object. Since in CGRtools atom-to-atom mapping corresponds to numbering of atoms in molecules, the resulting product has AAM according to the rule applied. Thus, reaction has correct AAM and nothing special should be made to keep or find it.


In [ ]:
products = second_stage.split() # split product into individual molecules
react = ReactionContainer([m6], products) # unite reagent and product into reaction. 
react.fix_positions()
react

For multicomponent reactions one can merge molecules of reactants into single MoleculeContainer object and apply reactor on it.

It is possible to generate all available products in case that molecule has several groups matching the query.


In [ ]:
m6copy = m6.copy() # let's try to use molecule with several groups mathcing query
m6copy.atom(5).isotope = 13 # isotope mark is added to see the difference in products
m6copy.reset_query_marks() # query marks need to be calculated
enums = set()              # the set enums is used to select structurally diverse products
for m in reactor(m6copy, limit=0): # limit=0 is enumeration of all possible products by reactor
    print(m)                       # print signatures for observed molecules. Notice presence of water as component of product
    m.calculate2d(scale=2)         # recalculate coordinates
    enums.update(m.split())        # split product into separate molecules
enums = list(enums)                # set of all resulting molecules

In [ ]:
m6copy

Let's have a look at molecules in set:


In [ ]:
enums[0]

In [ ]:
enums[1]

In [ ]:
enums[2]

6.2. MetaReactions (reactions on CGRs).

Reactor could be applied to CGR to introduce changes into reaction.

6.2.1. Example of atom-to-atom mapping fixing.


In [ ]:
reactions[1] # reaction under study

In [ ]:
cgr = ~reactions[1] # generate reaction CGR
print(cgr)

In [ ]:
cgr.centers_list # reaction has two reaction centers. [10,11,12] - pseudo reaction appeared due to AAM error

reaction has AAM error in nitro-group

Lets try to use Reactor for AAM fixing


In [ ]:
nitro = QueryCGRContainer() # construct query for invalid reaction center - CGR of wrongly mapped nitro-group
nitro.add_atom({'element': 'N', 'charge': 1, 'p_charge': 1}) # atom 1
nitro.add_atom({'element': 'O', 'charge': 0, 'p_charge': -1}) # atom 2. Notice that due to AAM error charge was changed
nitro.add_atom({'element': 'O', 'charge': -1, 'p_charge': 0}) # atom 3. Notice that due to AAM error charge was changed
nitro.add_atom('C') # atom 4

nitro.add_bond(1, 2, {'order': 2, 'p_order': 1}) # bond between atoms 1 and 2. Due to AAM error bond is dynamic ('2>1' type) 
nitro.add_bond(1, 3, {'order': 1, 'p_order': 2}) # bond between atoms 1 and 3. Due to AAM error bond is dynamic ('1>2' type) 
nitro.add_bond(1, 4, 1) # ordinary bond
print(nitro)
# this query matches reaction center in CGR appeared due to AAM error.

In [ ]:
nitro < cgr # query matches CGR of reaction with error.

In [ ]:
valid_nitro = MoleculeContainer() # construct nitro group without dynamic atoms. Notice that atom order should correspond object nitro
valid_nitro.add_atom({'element': 'N', 'charge': 1}) # ordinary N atom
valid_nitro.add_atom({'element': 'O', 'charge': -1}) # ordinary negatively charged oxygen atom
valid_nitro.add_atom('O')                            # ordinary oxygen atom

valid_nitro.add_bond(1, 2, 1) # ordinary single bond
valid_nitro.add_bond(1, 3, 2) # ordinary double bond
print(valid_nitro)

In [ ]:
valid_nitro.calculate2d()
valid_nitro # this is correct representation of group.

Now time to prepare and apply Template to CGR based on reaction with incorrect AAM.

Template is Reaction container with query in reactants and patch in products


In [ ]:
template = ReactionContainer([nitro], [valid_nitro]) # template shows how wrong part of CGR is transformed into correct one.
print(template) # notice complex structure of query: CGR signature is given in braces, then >> and molecule signature

Reactor class accept single template. Existence of dynamic bond in it is not a problem.


In [ ]:
reactor = CGRreactor(template)

Reactor object is callable and accept as argument molecule or CGR.

NOTE: fixed is new CGR object


In [ ]:
fixed = reactor(cgr) # fix CGR

CGRreactor returns None if template could not be applied, otherwise patched structure is returned.


In [ ]:
print(fixed)

C-C(-I).O-C:1:C:C:C(:C:C:1)-[N+](-[O-])=O
C-C(.I)-O-C:1:C:C:C(:C:C:1)-[N+](-[O-])=O

One can see that nitro group has no dynamic bonds any more. CGR corresponds only to substitution.


In [ ]:
fixed.centers_list # reaction center appeared due to AAM error before does not exist. Only 1 reaction center is found

Here is depiction of observed CGR (external software was used). Notice absence of wrong reaction center.

6.3.2 Reaction transformation

Example of E2 to SN2 transformation.

E2 and SN2 are concurrent reactions. We can easily change reaction center of E2 reaction to SN2. It could be achieved by substitution of reaction center corresponding to double bond formation in E2 reaction by the one corresponding to formation of new single bond with base as in SN2.


In [ ]:
from CGRtools import CGRreactor, CGRpreparer
from CGRtools.containers import QueryCGRContainer, ReactionContainer
from CGRtools.files import MRVread, SDFwrite
from io import StringIO

In [ ]:
e2 = next(MRVread('e2.mrv')) # read E2 reaction from ChemAxon MRV file
e2

In [ ]:
# create CGR query for E2 reaction side
e2query = QueryCGRContainer() 
e2query.add_atom('C', 1) # create carbon with mapping number 1
e2query.add_atom('C', 2) # create carbon with mapping number 2
# addition of any halogen atom
e2query.add_atom({'element': ['I', 'Cl', 'Br'], 'neighbors': 1, 'p_neighbors': 0, 'charge': 0, 'p_charge': -1}, 3)
# addition of OH-, RO-, SH-, RS- groups
e2query.add_atom({'element': ['O', 'S'], 'neighbors': [0, 1], 'p_neighbors': [0, 1], 'charge': -1, 'p_charge': 0}, 4)

e2query.add_bond(1, 2, {'order': 1, 'p_order': 2}) # bond between two carbon corresponds to formation of double from single
e2query.add_bond(1, 3, {'order': 1, 'p_order': None}) # bond between carbon and halogen breaks in E2 reaction
print(e2query) # it is CGR of E2 reaction center

In [ ]:
e2_cgr = ~ e2 # compose reaction into CGR
e2_cgr.reset_query_marks() # prepare to isomorphism

CGR is the following (depicted by external software)


In [ ]:
e2query < e2_cgr # E2 CGR pattern works!

In [ ]:
# create patch creating SN2 reaction. Notice that ordering of atoms correspond to that of E2 CGR query
sn2patch = QueryCGRContainer()
sn2patch.add_atom({}, 1) # save atom unchanged. We don't specify atom type since it will be taken from E2 query
sn2patch.add_atom({}, 2) # it is central atom. We don't specify atom type since it will be taken from E2 query
sn2patch.add_atom({'charge': 0, 'p_charge': -1}, 3) # elements list with same order of elements [I, Cl, Br] could be used as well
sn2patch.add_atom({'charge': -1, 'p_charge': 0}, 4)

sn2patch.add_bond(1, 2, {'order': 1, 'p_order': 1}) # set carbon - carbon single bond that is unchanged in SN2 reaction
sn2patch.add_bond(1, 3, {'order': 1, 'p_order': None}) # this bond is broken in SN2 reaction
sn2patch.add_bond(1, 4, {'order': None, 'p_order': 1}) # it corresponds to formation of bond O(S)-C bond in SN2 reaction

In [ ]:
reactor = CGRreactor(ReactionContainer([e2query], [sn2patch])) # create template and pass it to Reactor
sn2_cgr = reactor(e2_cgr) # apply Reactor on E2 reaction

In [ ]:
print(sn2_cgr)

It is depiction of CGR produced by Reactor (external software is used)


In [ ]:
# decompose CGR into reaction
preparer = CGRpreparer() 
sn2 = preparer.decompose(sn2_cgr)

In [ ]:
sn2.calculate2d()
sn2 # reaction has the same reagents like E2 above, but products correspond to SN2 reaction